
174 research outputs found

Ellogon: A New Text Engineering Platform

Authors: Androutsopoulos Ion, Karkaletsis Vangelis, Paliouras Georgios, Petasis Georgios, Spyropoulos Constantine D.
Publication date: 01/01/2002

This paper presents Ellogon, a multi-lingual, cross-platform, general-purpose text engineering environment. Ellogon was designed to aid both researchers in natural language processing and companies that produce language engineering systems for the end user. Ellogon provides a powerful TIPSTER-based infrastructure for managing, storing and exchanging textual data, embedding and managing text processing components, and visualising textual data and their associated linguistic information. Among its key features are full Unicode support, an extensive multi-lingual graphical user interface, a modular architecture and reduced hardware requirements.

Comment: 7 pages, 9 figures. Will be presented to the Third International Conference on Language Resources and Evaluation - LREC 200

arXiv.org e-Print Archive

CiteSeerX

Tensor Factorization with Label Information for Fake News Detection

Authors: Katsimpras Georgios, Paliouras Georgios, Papanastasiou Frosso
Publication date: 11/08/2019

The buzz over so-called "fake news" has raised concerns about a degenerated media environment and created a need for technological solutions. As the detection of fake news is increasingly considered a technological problem, it has attracted considerable research. Most of these studies focus primarily on information extracted from the textual content of news. In contrast, we focus on detecting fake news based solely on the structural information of social networks. We suggest that the underlying network connections of users who share fake news are discriminative enough to support the detection of fake news. We therefore model each post as a network of friendship interactions and represent a collection of posts as a multidimensional tensor. Taking into account the available labeled data, we propose a tensor factorization method that associates the class labels of data samples with their latent representations. Specifically, we combine a classification error term with the standard factorization in a unified optimization process. Results on real-world datasets demonstrate that the proposed method is competitive with state-of-the-art methods while using an arguably simpler approach.

Comment: Presented at the Workshop on Reducing Online Misinformation Exposure ROME 201
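The idea of combining a classification error term with a standard factorization can be illustrated with a small sketch. The abstract does not give the exact objective, so everything below is an assumption made for illustration: a CP-style reconstruction of a posts x users x users tensor, a logistic loss on the post factors, and the hypothetical names `cp_reconstruct`, `joint_objective`, `w` and `lam`.

```python
import numpy as np


def cp_reconstruct(A, B, C):
    """Rebuild a 3-way tensor from CP factors A (I x R), B (J x R), C (K x R)."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)


def joint_objective(X, A, B, C, w, y, lam=1.0):
    """Reconstruction error plus a weighted classification error.

    X   : tensor of shape (posts, users, users)
    A   : latent representation of each post (one row per post)
    w   : weights of a linear classifier acting on the rows of A (assumed)
    y   : binary labels in {0, 1}, one per post
    lam : trade-off between the two terms (assumed hyperparameter)
    """
    # Standard factorization term: squared Frobenius reconstruction error.
    recon_err = np.sum((X - cp_reconstruct(A, B, C)) ** 2)
    # Classification term: logistic loss log(1 + exp(z)) - y * z.
    logits = A @ w
    clf_err = np.sum(np.logaddexp(0.0, logits) - y * logits)
    return recon_err + lam * clf_err
```

Minimizing such an objective over `A`, `B`, `C` and `w` together is what ties the latent post representations to the class labels; the optimizer itself is left out of this sketch.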

arXiv.org e-Print Archive

Evaluation Measures for Hierarchical Classification: a unified view and novel approaches

Authors: Androutsopoulos Ion, Gaussier Eric, Kosmopoulos Aris, Paliouras Georgios, Partalas Ioannis
Publication venue: Springer Science and Business Media LLC
Publication date: 01/07/2013

Hierarchical classification addresses the problem of classifying items into a hierarchy of classes. An important issue in hierarchical classification is the evaluation of different classification algorithms, which is complicated by the hierarchical relations among the classes. Several evaluation measures have been proposed for hierarchical classification, each using the hierarchy in a different way. This paper studies the problem of evaluation in hierarchical classification by analyzing and abstracting the key components of the existing performance measures. It also proposes two alternative generic views of hierarchical evaluation and introduces two corresponding novel measures. The proposed measures, along with the state-of-the-art ones, are empirically tested on three large datasets from the domain of text classification. The empirical results illustrate the undesirable behavior of existing approaches and show how the proposed methods overcome most of these problems across a range of cases.

Comment: Submitted to journa
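To make concrete how an evaluation measure can use the hierarchy, here is a minimal sketch of one common set-based scheme, in which the true and predicted label sets are extended with their ancestors before precision and recall are computed. This is not necessarily one of the measures the paper proposes; the `parent` mapping and the function names are hypothetical, and the hierarchy is assumed to be a tree whose top-level classes have no entry in `parent`.

```python
def ancestors(node, parent):
    """Return all ancestors of `node`, following child -> parent links."""
    found = set()
    while node in parent:
        node = parent[node]
        found.add(node)
    return found


def augment(labels, parent):
    """Extend a label set with the ancestors of every label in it."""
    extended = set(labels)
    for label in labels:
        extended |= ancestors(label, parent)
    return extended


def hierarchical_prf(true_labels, pred_labels, parent):
    """Precision, recall and F1 computed on ancestor-augmented label sets."""
    true_aug = augment(true_labels, parent)
    pred_aug = augment(pred_labels, parent)
    overlap = len(true_aug & pred_aug)
    precision = overlap / len(pred_aug) if pred_aug else 0.0
    recall = overlap / len(true_aug) if true_aug else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy hierarchy: two leaves under "mammals", one leaf under "birds".
parent = {"cats": "mammals", "dogs": "mammals", "sparrows": "birds"}
near_miss = hierarchical_prf({"cats"}, {"dogs"}, parent)
far_miss = hierarchical_prf({"cats"}, {"sparrows"}, parent)
```

Under a flat measure both predictions would simply count as errors; here the sibling prediction receives partial credit because it shares the ancestor `mammals` with the true class, while the prediction in the other subtree receives none.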

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

Modeling Web Navigation using Grammatical Inference

Authors: Georgios Korfiatis, Georgios Paliouras
Publication date: 05/03/2020

This paper presents a method that models user navigation on the Web as a whole, as opposed to a single Web site, with the aim of assisting the user by recommending pages. User modeling is done through data mining of Web usage logs, resulting in aggregate rather than personal models. The proposed approach extends Grammatical Inference methods by introducing an extra merging criterion, which examines the semantic similarity of automaton states. The experimental results showed that the method does indeed facilitate the modeling of Web navigation, which was not possible with the existing Web usage mining methods. However, a content-based recommendation model is shown to still outperform the proposed method, which suggests that knowledge of the navigation sequence does not contribute to the recommendation process. This is due to the thematic cohesion of navigation sessions, in comparison to the large thematic diversity of Web usage data. Among three variants of the proposed method, the one based on Blue Fringe, which examines a larger space of possible merges, performs better.
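The extra merging criterion can be pictured as an additional test applied before two automaton states are merged. The abstract does not say how semantic similarity is measured, so the sketch below rests on assumptions: each state is summarized by a hypothetical topic vector, similarity is cosine similarity, and `threshold` is an invented parameter.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def may_merge(structurally_compatible, topics_a, topics_b, threshold=0.5):
    """Allow a merge only if the usual state-merging test passes AND the
    two states are semantically similar enough (assumed criterion)."""
    return structurally_compatible and cosine(topics_a, topics_b) >= threshold
```

In a state-merging learner such as Blue Fringe, a check of this kind would sit alongside the usual compatibility test, so that structurally compatible states covering unrelated topics are kept apart.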

CiteSeerX